This report demonstrates several kinds of map graphs that can be made in R using packages such as ggplot2, plotly, maps, and mapdata. To do so, static and interactive graphs of worldwide COVID-19 data are created. These data are available from the Johns Hopkins University GitHub repository for COVID-19 data (Dong, Du, and Gardner 2020).
Data can be added to maps using the latitude and longitude location associated with each data point. These coordinates are used to locate each data point on a drawn map. The following examples use ggplot2’s borders() function to define the scope and draw the borders of the map.
Data can be plotted on a world map as a scatterplot, using the size of each point to indicate the relative magnitude of the variable. For example, this graph shows the worldwide number of confirmed COVID-19 cases on April 2, 2020 -
The same can be done using a map of the US. The following is a plot of the number of confirmed COVID-19 cases in the US on April 5, 2020 -
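A minimal sketch of this kind of plot, assuming a data frame with Lat, Long, and Confirmed columns like those in the Johns Hopkins daily reports. The `report` data frame and its values are made up for illustration, and the colors, theme, and aspect ratio are assumptions rather than the settings behind the graph in this report.

```r
library(ggplot2)  # borders() also needs the 'maps' package to be installed

# Hypothetical stand-in for a daily report: three points with made-up counts
report <- data.frame(
  Lat       = c(40.7, 51.5, -33.9),
  Long      = c(-74.0, -0.1, 151.2),
  Confirmed = c(50000, 12000, 3000)
)

world_points <- ggplot(report, aes(x = Long, y = Lat, size = Confirmed / 1000)) +
  borders("world", colour = NA, fill = "grey90") +  # base map drawn first
  geom_point(shape = 21, colour = "purple", fill = "purple", alpha = 0.5) +
  labs(title = "Confirmed cases (illustrative values)",
       x = "", y = "", size = "Cases (x1000)") +
  theme_bw() +
  theme(legend.position = "right") +
  coord_fixed(ratio = 1.5)
```

Printing `world_points` draws the map. Because borders() is added before geom_point(), the points are layered on top of the country shapes.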
R offers a number of options and color palettes, including colorblind-friendly palettes, that can be used to create more polished graphs that map the data to both the size and color of each point. The following graph shows the same US COVID-19 data as above, but uses a design by Anisa Dhana that applies the Viridis palette, a colorblind-friendly color gradient, along with some other options -
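As a rough sketch of mapping one variable to both size and color with a Viridis gradient, the following uses a small made-up data frame (`us_report`) in place of the real daily report. The "inferno" option, the break values, and the point size range are illustrative assumptions, not necessarily those of the design described above.

```r
library(ggplot2)  # borders() also needs the 'maps' package to be installed

# Hypothetical US points with made-up case counts
us_report <- data.frame(
  Lat       = c(42.4, 34.1, 41.9, 29.8),
  Long      = c(-71.1, -118.2, -87.6, -95.4),
  Confirmed = c(12000, 6000, 900, 80)
)
case_breaks <- c(100, 1000, 5000, 10000)

us_points <- ggplot(us_report, aes(x = Long, y = Lat)) +
  borders("state", colour = "white", fill = "grey90") +
  # The same variable drives both the size and the colour of each point
  geom_point(aes(size = Confirmed, colour = Confirmed), alpha = 0.7) +
  scale_size_continuous(name = "Cases", range = c(1, 7), breaks = case_breaks) +
  scale_colour_viridis_c(name = "Cases", option = "inferno", breaks = case_breaks) +
  # Draw colour as a stepped legend so it can be combined with the size legend
  guides(colour = guide_legend()) +
  theme_void() +
  coord_fixed(ratio = 1.5)
```

Giving both scales the same name and breaks is what lets ggplot2 show a single combined legend instead of two separate ones.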
Data can also be added to shapes on a drawn map (e.g., countries, states) by mapping the data to the name of the region. In R, this involves loading map data that contain the names of the regions in the map, then joining the data to be plotted, which are associated with the same regions, into the same table. Often, the region names in the plotted data must be reformatted slightly to match those in the map data for proper plotting. For these types of maps, the fill color of each shape typically indicates the relative magnitude of the variable. These map data are loaded with ggplot2’s map_data() function, which draws on the maps package and provides a range of regional data for creating maps, including the world, countries, and US states.
For example, the following is a map of the number of confirmed cases of COVID-19 in the contiguous US on April 2, 2020 mapped to the shapes of each state and colored using the Zissou1 gradient from the wesanderson package -
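The name-matching and joining steps described above might look like the following sketch. The `state_cases` values are made up, and base R's merge() is used here as one possible way to do the join; the report's own code may use a different approach.

```r
library(ggplot2)  # map_data() reads polygons from the 'maps' package

# Hypothetical state totals (made-up values), capitalized as in the daily reports
state_cases <- data.frame(
  Province_State = c("Massachusetts", "New York", "Washington"),
  Confirmed      = c(9000, 90000, 6000)
)

# map_data("state") stores lowercase region names, so lowercase ours to match
state_cases$region <- tolower(state_cases$Province_State)

us_states <- map_data("state")

# Left join: keep every polygon vertex; states without data get NA
state_join <- merge(us_states, state_cases, by = "region", all.x = TRUE)

# merge() reorders rows, so restore the drawing order of the polygon vertices
state_join <- state_join[order(state_join$order), ]

state_map <- ggplot(state_join,
                    aes(x = long, y = lat, group = group, fill = Confirmed)) +
  geom_polygon(colour = "black") +
  scale_fill_gradient(low = "white", high = "darkred", na.value = "grey80") +
  coord_fixed(1.3)
```

Restoring the `order` column matters: if the vertices are left in merged order, the polygons are drawn with lines crossing between unrelated points.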
County map data can also be used to plot data, including on top of a larger map. For example, the following graph plots the data for US COVID-19 confirmed cases on April 2, 2020 for each county on a map of the entire contiguous US -
For this map, the PuRd palette from RColorBrewer is used to create a simple color gradient. RColorBrewer provides a wide range of different color palettes to choose from, including single-color and multi-color gradients and colorblind-friendly options, that can be used to create effective maps.
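A sketch of a county-level fill using the PuRd palette, under the assumption that county names arrive in an `Admin2` column as in the Johns Hopkins daily reports. The two counties and their counts are made up, and the join matches on both state and county because county names repeat across states.

```r
library(ggplot2)  # map_data() and borders() need the 'maps' package

# Hypothetical county counts (made-up values)
county_cases <- data.frame(
  Province_State = c("Massachusetts", "Massachusetts"),
  Admin2         = c("Middlesex", "Suffolk"),
  Confirmed      = c(2000, 1900)
)
county_cases$region    <- tolower(county_cases$Province_State)
county_cases$subregion <- tolower(county_cases$Admin2)

us_counties <- map_data("county")

# Match on state AND county name, then restore the polygon drawing order
county_join <- merge(us_counties,
                     county_cases[, c("region", "subregion", "Confirmed")],
                     by = c("region", "subregion"), all.x = TRUE)
county_join <- county_join[order(county_join$order), ]

county_map <- ggplot(county_join, aes(x = long, y = lat, group = group)) +
  geom_polygon(aes(fill = Confirmed)) +
  borders("state", colour = "black") +  # state outlines drawn over the counties
  scale_fill_gradientn(
    colours  = RColorBrewer::brewer.pal(n = 5, name = "PuRd"),
    na.value = "white"
  ) +
  coord_fixed(1.3) +
  theme_void()
```

Calling `RColorBrewer::display.brewer.all(colorblindFriendly = TRUE)` previews the colorblind-friendly palettes that the package offers.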
County map data can be made for single states as well. The following map shows the number of confirmed cases in each Massachusetts county on April 2, 2020 -
Note that Nantucket and Dukes counties were reported together in the COVID-19 data, and 303 confirmed cases were not assigned to any county, so these data are not included in the graph.
Map graphs can also be made interactive using plotly. Like interactive scatterplots, bar graphs, or line graphs, these maps can be manipulated (e.g., zoomed in or out and panned), and exact values can be seen by hovering over a point or shape on the map. For example, the following graph shows the same Massachusetts county data for COVID-19 confirmed cases on April 2, 2020, this time as an interactive graph with a different color palette (Zissou1 from the wesanderson package) -
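One way to get an interactive map of this kind is to build a regular ggplot2 map and pass it to plotly's ggplotly() function. The sketch below assumes that approach; the county counts are made up, and the default fill gradient stands in for the Zissou1 palette to keep the example free of extra packages.

```r
library(ggplot2)  # map_data() needs the 'maps' package
library(plotly)

# Polygons for Massachusetts counties only
ma_counties <- subset(map_data("county"), region == "massachusetts")

# Hypothetical county counts (made-up values)
ma_cases <- data.frame(
  subregion = c("middlesex", "suffolk", "essex"),
  Confirmed = c(2000, 1900, 1000)
)

ma_join <- merge(ma_counties, ma_cases, by = "subregion", all.x = TRUE)
ma_join <- ma_join[order(ma_join$order), ]  # restore polygon drawing order

ma_map <- ggplot(ma_join,
                 aes(x = long, y = lat, group = group, fill = Confirmed)) +
  geom_polygon(colour = "black") +
  ggtitle("Confirmed cases in MA (illustrative values)") +
  labs(x = NULL, y = NULL) +
  theme_void()

# Convert the static ggplot into an interactive htmlwidget
ma_widget <- ggplotly(ma_map)
```

Printing `ma_widget` at the console or in an R Markdown chunk displays the interactive version, with zooming, panning, and hover labels built from the mapped aesthetics.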
To demonstrate a more complex example, the following is an interactive graph of the worldwide deaths from COVID-19 by country as of September 26, 2020 -
Summarize the counts for each country in the COVID-19 Deaths graph above and update the graph to 9/26/2020. It is unclear to me how this exercise differs from the interactive world map directly above, since that map already uses the COVID-19 deaths data for September 26, 2020. The exercise also does not specify which count to summarize. However, its further explanation mentions using the mean or median latitude and longitude, so it may be asking for the deaths data to be plotted as points on the world map. In that case, the following are interactive world maps of COVID-19 deaths and then confirmed cases as of September 26, 2020, where the color and size of each point indicate the relative count -
Unfortunately, data for some countries (including the US, Canada, the UK, and France) are missing because these countries do not have latitude and longitude coordinates in the daily report data from the repository.
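The per-country summary described above (total count plus a median latitude and longitude) can be sketched in base R as follows. The `daily` data frame, its country names, and its values are all hypothetical, and using the median rather than the mean is just one of the two options the exercise mentions.

```r
# Hypothetical rows shaped like a daily report; "Country C" has no coordinates
daily <- data.frame(
  Country_Region = c("Country A", "Country A",
                     "Country B", "Country B", "Country B", "Country C"),
  Lat    = c(-33.9, -37.8, 30.6, 39.9, 31.2, NA),
  Long   = c(151.2, 145.0, 114.3, 116.4, 121.5, NA),
  Deaths = c(10, 20, 100, 5, 7, 50)
)

# Total deaths per country
deaths_by_country <- aggregate(Deaths ~ Country_Region, data = daily, FUN = sum)

# One plotting position per country: the median of its reported coordinates.
# The formula interface drops rows with missing Lat or Long by default.
centers <- aggregate(cbind(Lat, Long) ~ Country_Region, data = daily, FUN = median)

# Inner join: countries without any coordinates fall out of the result
by_country <- merge(deaths_by_country, centers, by = "Country_Region")
```

In this sketch, "Country C" has a death total but no row in `by_country`, which mirrors how countries without coordinates end up missing from the point maps.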
I also noticed that the exercise explanation mentions the number of confirmed cases, so I have also created an interactive map, in the filled style above, of worldwide COVID-19 confirmed cases as of September 26, 2020. I chose a different color palette to distinguish it from the map of COVID-19 deaths and added a few more countries for a more complete graph -
Update Anisa Dhana’s graph layout of the US to 9/26/2020. Because the number of cases has increased tremendously over the past six months, I changed the size and color scales for the points from a log scale to a linear scale. Otherwise, the points would overlap far too much.
Update the “Number of Confirmed Cases by US County” graph to 9/26/2020 and use a different color scheme and/or theme. I have chosen to use the Greys color scheme from RColorBrewer with the classic theme -
Make an interactive plot of a state of your choosing using a theme different from the above examples. I have chosen to display COVID-19 data from Washington since it was the location of the first recorded case in the US. Out of curiosity, I have plotted both the number of deaths and the number of confirmed cases as of September 26, 2020. I chose a different color palette for each to distinguish the two easily: the cividis and plasma options of the Viridis palette for the number of deaths and confirmed cases, respectively.
Create a Lab report made to be easily readable by others (friends, family), including hiding warnings, messages, and even code. Include references and a link to the Lab 6 report from your GitHub site. I have created this entire report following this guideline. If one wants to view the code used to create this report, the RMarkdown file can be found on my GitHub repository (Lab6_SNC.Rmd).
Dong, Ensheng, Hongru Du, and Lauren Gardner. 2020. “COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University.” https://github.com/CSSEGISandData/COVID-19.